Which statistics reflect semantics? Rethinking synonymy and word similarity
Author
Abstract
A great deal of recent work has addressed the statistical modeling of word similarity relations (cf. Schütze (1992), Lund and Burgess (1996), Landauer and Dumais (1997), Lin (1998), Turney (2001)). While this has largely been viewed as an engineering task (with the notable exception of much writing on Latent Semantic Analysis (LSA)), the relative success of different approaches to constructing word similarity measures is highly relevant to issues in theoretical semantics and language acquisition. With this background in mind, this paper has two main aims. First, we will present yet another statistical approach to the calculation of word-similarity scores (LC-IR), which significantly outperforms other methods on standard benchmarks, including the 80-question set of TOEFL® synonym test items first employed by Landauer and Dumais (1997). Second, we hope to demonstrate that
Similar resources
Measuring the Degree of Synonymy between Words Using Relational Similarity between Word Pairs as a Proxy
Two types of similarity between words have been studied in the natural language processing community: synonymy and relational similarity. A high degree of similarity exists between synonymous words. On the other hand, a high degree of relational similarity exists between analogous word pairs. We present and empirically test a hypothesis that links these two types of similarity. Specifically,...
Behavioral profiles: A corpus-based perspective on synonymy and antonymy*
1 Introduction. 1.1 Two empirical perspectives in the study of synonymy and antonymy. The domain of linguistics that has arguably been studied most from a corpus-linguistic perspective is lexical, or even lexicographical, semantics. The early work of pioneers such as Firth and Sinclair already paved the way for the study of lexical items, their distribution, and what their distribution reveal...
Semantics of haq in the Glorious Quran
Meaning plays a very important role at all levels of linguistic analysis. A word taken by itself, outside the chain of speech, does not show its true meaning; its meaning is revealed only in relation to other signs within the language. The Quran, the precious word of Allah, contains words that take a variety of meanings in the syntactic and topical con...
Descriptive Semantics of the Nominal Hapax Legomenon of the Word Menhaj and the Pathology of its Three Translations (Meybodi, Makarem Shirazi and Ansarian)
Understanding the Quran depends on appreciating the meanings of single words and concepts that are interconnected and interrelated like a chain. A nominal hapax legomenon in the Quran is a word that occurs only once in the Holy Quran. Such words need semantic scrutiny because they are difficult to understand. Accordingly, understanding hapax legomena calls for examining and identifying t...
Computing Semantic Relatedness in German with Revised Information Content Metrics
The paper presents an application of information-content-based metrics to compute the semantic relatedness of word senses in German. The main contributions are: an annotation study based on a revised definition of semantic relatedness beyond synonymy, an extension of Resnik’s (1995) procedure for computing information content of concepts for strongly inflected languages, an application of informati...